[00:00.000 --> 00:04.220]  As you mentioned, we're going to talk about hijacking OAuth tokens in Google Cloud.
[00:05.060 --> 00:09.660]  Good thing about Safe Mode this year is if this is the wrong talk, you can quietly leave and I
[00:09.660 --> 00:17.660]  won't even see you. All right, let's get into it here. So, if you need to get hold of me,
[00:17.660 --> 00:21.820]  just DM me at Twitter. You can usually find me on most of the social platforms.
[00:22.060 --> 00:28.100]  What we're going to cover in this talk is hijacking of OAuth tokens, session tokens,
[00:28.100 --> 00:34.340]  for CLI access, API access, and it's really the ease of attack that we're going to focus on.
[00:34.340 --> 00:39.440]  There's additional use cases related to service accounts and compute instances
[00:40.060 --> 00:43.640]  that we'll look at as well. And from a defensive perspective,
[00:43.640 --> 00:48.240]  we're definitely going to drill into some of the challenges in securing the environment.
[00:48.820 --> 00:54.540]  Really, who cares, right? Well, OAuth is fundamental to Google authentication. It's
[00:54.540 --> 01:00.020]  underneath everywhere. There are some nuances depending on if you are authenticating with
[01:00.020 --> 01:08.100]  user accounts versus service accounts, whether your sessions are from browsers versus SDK access,
[01:08.100 --> 01:13.720]  whether you're external to the GCP environment versus internal, say, on a compute VM.
[01:13.720 --> 01:18.360]  And that is part of what leads to a lot of the opportunity from the attack perspective,
[01:18.360 --> 01:22.920]  as well as the challenges from the defensive perspective. Essentially, the session tokens are
[01:22.920 --> 01:28.860]  what the name implies. They're post-authentication representing a valid authentication session.
[01:28.860 --> 01:33.680]  And it's really an opportunity for us to hijack and use them and gain access to existing
[01:33.680 --> 01:40.960]  authenticated sessions. And if I, as an attacker, have access to an endpoint, say a GCP administrator's
[01:40.960 --> 01:46.460]  device, those cached accounts and environments are easily accessible to me. And that's what
[01:46.460 --> 01:51.940]  we're going to go through. At the end of the day, it's all about persistence, as well as
[01:51.940 --> 01:58.840]  evading detection from the attacker's perspective. Now, the defensive side,
[01:58.840 --> 02:08.040]  what is interesting are some of the challenges. Some of the controls such as MFA don't quite help,
[02:08.040 --> 02:12.820]  although, counterintuitively, you might think they do, right? And we'll go through that.
[02:12.820 --> 02:18.780]  And in terms of securing the environment, there's a lot of options that are confusing,
[02:18.780 --> 02:24.240]  easy to misconfigure, and also easily misunderstood. So at the end of the day,
[02:24.240 --> 02:29.920]  we're talking about how do you handle incident response? What's your playbook for ops? Is it
[02:29.920 --> 02:40.100]  fully up to date with the most optimal measures or controls? So let's start with just a brief
[02:40.100 --> 02:46.660]  overview of OAuth. Okay, just really briefly, there's some verbiage on this slide, but focus
[02:46.660 --> 02:51.340]  on the right hand diagram. At a very high level, there's a simplified flow, no matter how you
[02:51.340 --> 02:59.500]  authenticate. First, you'll request a token, it'll be signed with your credentials, you'll request
[02:59.500 --> 03:05.700]  scopes or permissions. And as a user, you often see this when you first authenticate with gcloud
[03:05.700 --> 03:12.240]  or CLI. By default, a browser is launched, you have to go type in your Google credentials,
[03:12.240 --> 03:17.820]  you also have to approve the actual scopes being requested. And if that goes properly,
[03:17.820 --> 03:23.500]  then what is returned ultimately to the requester, in this case, it'd be the CLI,
[03:23.500 --> 03:29.420]  would be two important things, the access session token, that's an OAuth session token,
[03:29.420 --> 03:35.940]  as well as a refresh token. The access token is good for an hour. By default,
[03:35.940 --> 03:42.040]  you use it directly in API calls, they all take OAuth tokens. After the hour is up,
[03:42.040 --> 03:48.000]  you can refresh and get a new one. So gcloud, gsutil, CLI tools do this automatically,
[03:48.000 --> 03:54.860]  you usually don't see this level of the protocol in the handshake. The local language SDK libraries
[03:54.860 --> 04:00.320]  like Python, etc, also do this transparently. So generally, your code is not at this level.
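As a rough illustration of what the CLI tools and SDK libraries do behind the scenes, here is a minimal Python sketch of a token cache that hands out the cached access token and only refreshes it when it is close to expiry. The class names, the `refresh` callback, and the 60-second skew are hypothetical, chosen for illustration; this is not how gcloud itself is implemented.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AccessToken:
    value: str
    expires_at: float  # Unix epoch seconds


class TokenCache:
    """Returns a cached access token and refreshes it once it is near expiry."""

    def __init__(
        self,
        refresh: Callable[[], AccessToken],  # hypothetical callback doing the refresh exchange
        skew_seconds: float = 60.0,          # refresh slightly early (assumed value)
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._refresh = refresh
        self._skew = skew_seconds
        self._clock = clock
        self._token: Optional[AccessToken] = None

    def get(self) -> str:
        now = self._clock()
        # Refresh when there is no token yet or the cached one is about to expire.
        if self._token is None or now >= self._token.expires_at - self._skew:
            self._token = self._refresh()
        return self._token.value
```

The point of the sketch is that the caller never re-authenticates: as long as the refresh callback keeps working, new one-hour access tokens keep coming.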
[04:00.320 --> 04:04.540]  If you do direct REST API calls, you will see this handshake and have to go through
[04:04.540 --> 04:09.440]  two or three back and forths. The other important thing to remember about OAuth,
[04:09.440 --> 04:15.540]  besides this handshake and the access and refresh tokens, is that, well, you don't authenticate
[04:15.540 --> 04:21.340]  every time you use the CLI, right? It's cached. All session tokens almost everywhere in every
[04:21.340 --> 04:25.420]  model and architecture are cached. In this case, it's in the SQLite database and has
[04:25.420 --> 04:32.080]  big implications for this talk. So let's continue on and let's actually start talking about an
[04:32.080 --> 04:37.320]  attack scenario and we'll jump to the command line to demonstrate this. Okay, a bit of a busy
[04:37.320 --> 04:44.660]  diagram. Focus on what we're assuming is we have access to an endpoint device, a client device,
[04:45.360 --> 04:53.660]  where a GCP admin has been using the CLI, gcloud and gsutil. What we're going to do is access that.
[04:53.660 --> 05:00.180]  Once we have access, we'll access that and look at the cached accounts, utilize it,
[05:00.180 --> 05:08.700]  access the GCP environment that the admin that user has previously accessed. Then we're going to
[05:08.700 --> 05:13.680]  copy the credentials, move it off board to a different host under my control, me, the attacker,
[05:13.680 --> 05:22.260]  and then we'll show how to access it from there. Okay, so I've got a split screen shell environment.
[05:22.260 --> 05:27.300]  Left is the target host. The right will be an additional attacker host. Let me explain a few
[05:27.300 --> 05:35.360]  things. This is video just recorded in the past day. I have obfuscated host names, IDs, a lot of unique
[05:35.360 --> 05:39.360]  things, but this was run against a live GCP environment. We're going through the attack
[05:39.360 --> 05:43.480]  scenario, but just a few notes on the defensive side, which will come into play in the second half
[05:43.480 --> 05:50.660]  of this presentation. MFA is enabled for the environments that we'll be looking at. Okay,
[05:50.660 --> 05:59.000]  just keep that in mind. And it had been a number of weeks since access was initially
[05:59.000 --> 06:06.180]  configured for these environments. So let's jump into it. If I have access to my target host,
[06:06.180 --> 06:13.800]  I can, of course, run gcloud commands as that user. And I can check with auth list what accounts
[06:13.800 --> 06:21.720]  are configured. Two accounts, user dev MFA, admin at prod MFA, named to give us a hint. One's a
[06:21.720 --> 06:26.580]  production environment. The other is dev. Both have MFA enabled. In fact, there's a hardware key
[06:26.580 --> 06:34.400]  on the prod environment. There's software MFA on the dev one. So I've got some comments along the
[06:34.400 --> 06:39.220]  way, but I'm going to highlight the bold white commands. All right, I can switch accounts with
[06:39.220 --> 06:45.440]  gcloud config set account. Let's switch to the admin prod account. That's interesting. All right,
[06:45.440 --> 06:51.140]  it's switched over. Let's see if I can actually access it. Let me list the projects. Boom. Let me list
[06:51.140 --> 06:57.320]  the buckets in the project. I get a bucket. Let me enumerate the bucket. Okay, let's pause here.
[06:57.320 --> 07:06.520]  Okay, if you're an admin, normally using gcloud, gsutil, this is boring. This is what you do every
[07:06.520 --> 07:10.900]  day. What makes this interesting is the attacker perspective within the context of the actual
[07:10.900 --> 07:16.600]  environments. As I mentioned, the environments have MFA enabled. And as we'll see as we get
[07:16.600 --> 07:23.380]  further along, these accounts were actually configured quite a while ago. The production
[07:23.380 --> 07:29.200]  account was added back in June, maybe six, seven weeks ago, right? So essentially, in this
[07:29.200 --> 07:35.640]  environment, this live environment, I have cached access to a production account. And as an attacker,
[07:35.640 --> 07:41.220]  it was easy, right? Just no authentication required six weeks later, seven weeks later,
[07:41.220 --> 07:48.260]  despite things like MFA. Let's continue on. All right, so what I want to do now is I want to
[07:48.920 --> 07:56.080]  exfiltrate this whole cache, move it off host, right? So what I'm going to do is go to the
[07:56.080 --> 08:03.140]  location of cache. It's in your home directory dot config in a gcloud subdirectory on Unix.
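From the defender's side, it can help to know exactly where this cache lives on a given machine. The Python sketch below resolves the directory, assuming the `CLOUDSDK_CONFIG` environment variable override and the `%APPDATA%\gcloud` location on Windows, and flags files that are readable by group or others. The function names are hypothetical. Note that tight file permissions do not stop an attacker who is already running as that user, which is the scenario in this talk.

```python
import os
import stat
from pathlib import Path
from typing import List, Mapping, Optional


def gcloud_config_dir(
    env: Optional[Mapping[str, str]] = None,
    os_name: Optional[str] = None,
    home: Optional[str] = None,
) -> Path:
    """Best-effort location of the gcloud configuration and credential cache."""
    env = os.environ if env is None else env
    os_name = os.name if os_name is None else os_name
    override = env.get("CLOUDSDK_CONFIG")
    if override:
        return Path(override)
    if os_name == "nt" and env.get("APPDATA"):
        return Path(env["APPDATA"]) / "gcloud"
    base = Path.home() if home is None else Path(home)
    return base / ".config" / "gcloud"


def group_or_world_accessible(directory: Path) -> List[Path]:
    """List files whose POSIX mode grants any access to group or others."""
    flagged = []
    for path in sorted(directory.rglob("*")):
        if path.is_file() and stat.S_IMODE(path.stat().st_mode) & 0o077:
            flagged.append(path)
    return flagged
```

A quick check would be `group_or_world_accessible(gcloud_config_dir())` on each admin workstation.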
[08:03.140 --> 08:10.480]  Windows has its own corresponding directory. I'm going to tar it up, I'll SCP it off board,
[08:10.480 --> 08:16.140]  and then I'm going to switch to the right side. Okay, so focus on this other host, where I've
[08:16.140 --> 08:22.160]  SCP'd the tarball. And what I'm going to do is, so far, I'm just checking my environment.
[08:22.160 --> 08:27.900]  I haven't done anything yet. All right. I'm going to show that there's no accounts
[08:28.660 --> 08:32.840]  configured, right? It's empty, no credential accounts. I'm going to take that tarball,
[08:32.840 --> 08:41.620]  unpack it in my own dot config directory. And boom, right off. At least the configuration
[08:41.620 --> 08:49.260]  matches, right? Shows exactly the state. Now, the note to self right here is,
[08:49.260 --> 08:55.500]  that was easy. Way too easy. There's no device-specific encryption, obviously, or any kind
[08:55.500 --> 09:01.300]  of tagging in this data. So it transparently carries over, and I don't have to do anything
[09:01.300 --> 09:06.240]  except place it in the same directory. Let's see if it's actually useful and valid. I can switch
[09:06.240 --> 09:11.360]  accounts. Again, different host, same credential file copied over. I can switch accounts to that
[09:11.360 --> 09:19.820]  prod account. I can list the same production bucket, right? Way too easy, all right?
[09:19.820 --> 09:26.400]  The commands, nothing special in the commands. This is really a lesson in how easy it is to
[09:26.400 --> 09:33.240]  access and exfiltrate and reuse the credentials in bulk for CLI access, which pretty much
[09:33.840 --> 09:40.900]  gives me everything, all right? So just to deepen our understanding, a variation of this attack
[09:41.460 --> 09:47.460]  actually makes use of the OAuth tokens in the gcloud subdirectory, all right?
[09:48.000 --> 09:52.260]  What we're going to do is go through a flow where we actually extract something from the SQLite
[09:52.260 --> 09:58.480]  databases holding the cached tokens. We're going to use that and perform some of the OAuth flow,
[09:58.480 --> 10:04.020]  which is request a new access token, because most likely the last access token used has
[10:04.020 --> 10:10.320]  expired. It only lasts an hour. Get a new access token, which gives me another hour, and then start
[10:10.320 --> 10:21.940]  making API calls. Okay, so let's continue on, right? What I'm going to do, and
[10:21.940 --> 10:27.100]  bear with me as we go through this video and I highlight things, all right? In the gcloud subdirectory
[10:27.100 --> 10:33.460]  where my token cache is stored, there are two .db files. They're SQLite database files,
[10:34.260 --> 10:40.320]  unencrypted, nothing special. What I can do is do a SQLite command, all right? I'm not going over
[10:40.320 --> 10:47.640]  the schema particularly. There are sources for that, but there's a simple table that has the last
[10:47.640 --> 10:55.000]  used or last retrieved access token by account. So I'm pulling the one associated with
[10:55.000 --> 11:01.220]  the production account. There it is, the ya29 token. All right, we'll continue on. Now that I have
[11:03.360 --> 11:08.960]  that, I want to look at the other database, which is credentials.db. This has more information
[11:08.960 --> 11:15.180]  in a big JSON blob. I'm also going to look at what's associated with the production admin account
[11:15.180 --> 11:23.700]  here and I have more information. I have a client ID and secret, which is related to OAuth and the
[11:23.700 --> 11:29.100]  protocol, but it's needed if I want to perform the whole OAuth flow myself. I have the refresh token,
[11:29.100 --> 11:34.680]  so that's useful. I need that. Also, I have access token repeated. Just note that this information is
[11:34.680 --> 11:39.480]  from my initial authentication configuration of this production account. So technically,
[11:39.480 --> 11:46.280]  if you take this epoch time with this EXP field, you run it through date and convert it,
[11:46.280 --> 11:52.840]  you'll get a June 17th date. So what I said is this is a live environment configured six, seven
[11:52.840 --> 11:58.560]  weeks ago and I haven't changed it yet and had default security settings and seven weeks later
[11:59.080 --> 12:05.420]  still have access to it easily. All right, so let's move on. Basically, I want to get that
[12:05.420 --> 12:12.120]  refresh token, right? So I just did a little sed/grep-fu on that same SQLite entry with the
[12:12.120 --> 12:18.580]  JSON above and pulled out that refresh token. Now I'm just going to do a few cURL REST API calls
[12:18.580 --> 12:25.320]  to go through the OAuth flow, all right? And the actual cURL command will be shown, but I'm using
[12:25.320 --> 12:32.060]  the refresh token. So let's see if I can pause this. Here's this cURL command right here in the
[12:32.060 --> 12:38.160]  screen. You can see that the client ID and secret and that refresh token are passed in as
[12:39.020 --> 12:46.140]  post parameters. I've got scope permissions I'm requesting. The cloud platform one is key,
[12:46.140 --> 12:54.240]  but there are some other IAM-type scopes as well. The endpoint is OAuth. And what I get back is a
[12:54.240 --> 13:02.220]  fresh new access token right here, okay? Now, good for an hour. Now, what can I do with that
[13:02.220 --> 13:08.420]  new access token? I can do real API calls. So let's continue on. I'm going to do another cURL
[13:08.420 --> 13:19.740]  command. Let's pause it once the output comes back. All right, here we go. So what did I do?
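For readers following along without the video, the refresh exchange described a moment ago is the standard OAuth 2.0 refresh grant. The Python sketch below only builds the request (it does not send it), assuming the JSON blob exposes `client_id`, `client_secret`, and `refresh_token` keys as named in the talk, and assuming Google's documented token endpoint `https://oauth2.googleapis.com/token`; the exact endpoint and scope list used in the demo are not stated, so treat both as assumptions.

```python
import json
import urllib.parse
import urllib.request
from typing import Iterable, Optional

# Assumption: Google's documented OAuth 2.0 token endpoint.
TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def build_refresh_request(
    cred_blob: str, scopes: Optional[Iterable[str]] = None
) -> urllib.request.Request:
    """Build (but do not send) a refresh-token grant request from a JSON blob."""
    cred = json.loads(cred_blob)
    form = {
        "grant_type": "refresh_token",
        "client_id": cred["client_id"],
        "client_secret": cred["client_secret"],
        "refresh_token": cred["refresh_token"],
    }
    if scopes:
        # Scopes are optional on refresh and are sent space-separated.
        form["scope"] = " ".join(scopes)
    body = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(
        TOKEN_ENDPOINT,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Nothing in this request proves who is sending it beyond possession of the three values, which is why MFA does not come into play at this step.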
[13:19.740 --> 13:25.040]  Here's the cURL command at the top, right, with that fresh new access token. I happen to choose
[13:25.740 --> 13:32.200]  a fairly easy API, the storage API, and I'm requesting a listing of the same bucket
[13:32.980 --> 13:39.360]  that I had done in the first scenario. And the JSON that's returned shows that I'm accessing it,
[13:39.360 --> 13:47.740]  right? Same pass, credit card, sensitive bucket, so on. So all in all, it just shows that I can
[13:48.310 --> 13:55.120]  do a variation at a deeper level, actually perform the OAuth flow. And although functionally,
[13:55.120 --> 14:01.050]  this is, from an attack perspective, not that different from the CLI since they're
[14:01.600 --> 14:07.420]  pretty much equivalent to doing direct API calls. This is more flexible in that I can
[14:08.540 --> 14:14.980]  spin up my own script and code and have somewhat tighter attack opportunities here.
[14:14.980 --> 14:21.740]  So the lesson from this is, either way, it's really easy to access the token cache and make
[14:21.740 --> 14:28.620]  use of it, and in a lot of cases, generate new tokens and access what environments have been
[14:28.620 --> 14:39.640]  cached. So if that's all there were to this, that might be more limiting. And really, though,
[14:39.640 --> 14:45.940]  there's some related areas with service accounts and compute instances that I want to talk about,
[14:45.940 --> 14:49.980]  right? More opportunities on the attack side, more hassle from the defensive viewpoint.
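Going back to the two SQLite files in the cache: a defender can inventory which accounts have a long-lived refresh token sitting on a workstation without ever printing the secrets. The Python sketch below assumes a table named `credentials` with columns `account_id` and `value` (a JSON blob); the talk does not go over the schema, so treat those names as hypothetical. It also shows the epoch-to-date conversion mentioned for the expiry field.

```python
import json
import sqlite3
from datetime import datetime, timezone
from typing import List


def accounts_with_refresh_tokens(db_path: str) -> List[str]:
    """Return account IDs whose cached JSON blob contains a refresh token."""
    conn = sqlite3.connect(db_path)
    try:
        # Assumed schema: credentials(account_id TEXT, value TEXT holding JSON).
        rows = conn.execute("SELECT account_id, value FROM credentials").fetchall()
    finally:
        conn.close()
    found = []
    for account_id, value in rows:
        blob = json.loads(value)
        if blob.get("refresh_token"):
            found.append(account_id)  # report the account only, never the token
    return sorted(found)


def epoch_to_utc(epoch_seconds: int) -> str:
    """Convert Unix epoch seconds to an ISO 8601 UTC timestamp."""
    return datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc).isoformat()
```

With an illustrative value (not the one from the demo), `epoch_to_utc(1592352000)` gives `2020-06-17T00:00:00+00:00`.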
[14:49.980 --> 14:56.200]  We've been talking a lot about user accounts in the cache, and one might normally think that
[14:56.200 --> 15:04.460]  service accounts wouldn't really be that valuable to look at on a compromised host. Why? Because
[15:04.460 --> 15:12.760]  if the target host is external to GCP, most likely what's happened is a service account
[15:12.760 --> 15:18.820]  key file has been generated when you generated a key, and that's been downloaded. And that JSON
[15:18.820 --> 15:25.800]  file can then be used within API code, but it can also be used within the CLI. We can activate and
[15:25.800 --> 15:31.200]  add that to our gcloud config and then switch to it as an account and start executing
[15:31.200 --> 15:37.180]  commands as that service account. But because a key file is very likely still on that client,
[15:37.180 --> 15:40.820]  that's a better target for me as an attacker, right? Because it's a full-blown credential and
[15:40.820 --> 15:46.220]  I can do anything with that, as opposed to having a temporary OAuth token that expires in an hour.
[15:46.220 --> 15:52.060]  However, as we'll see later on the defensive side, it is useful to grab this whole cache,
[15:52.060 --> 15:57.080]  including what would be the OAuth session tokens for the service accounts,
[15:57.080 --> 16:01.440]  if they've been configured, because it gives us persistence. We'll find that a lot of
[16:02.040 --> 16:07.980]  remediation commands that we try with service accounts disable the service account, but they
[16:07.980 --> 16:14.640]  leave the tokens valid, meaning it's a way to have additional access, even though the defender is
[16:15.420 --> 16:20.680]  remediating and revoking the service account access. All right, so for that reason alone,
[16:20.680 --> 16:25.840]  there's this variation where OAuth tokens for service accounts on an external client are still
[16:25.840 --> 16:33.280]  valuable to grab from an attack perspective. Speaking of service accounts, well, if they,
[16:33.280 --> 16:39.620]  you know, compute instances run as a service account, to be more secure, there's no key file
[16:39.620 --> 16:45.820]  there if you want to run code on that VM as that service account. What you do is you query the
[16:45.820 --> 16:52.920]  metadata service and you get an OAuth token that your scripts or code can use, and that way you
[16:52.920 --> 16:59.840]  have no key file that's lying around to be compromised. So that's an easy REST call. The
[16:59.840 --> 17:06.840]  URL is internal, resolves to a non-routable local IP. What we get back is an access token
[17:07.560 --> 17:12.620]  which we can use in a second API call, as an example, a storage call again.
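The two calls just described can be sketched in Python as follows. The sketch builds the requests and parses a token response without sending anything; it assumes the default service account path on the metadata server and the JSON API object-listing URL for Cloud Storage, and the function names and bucket name are made up for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Tuple

# Metadata server token path for the instance's default service account.
METADATA_TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_metadata_token_request() -> urllib.request.Request:
    # The metadata server requires this header on every request.
    return urllib.request.Request(
        METADATA_TOKEN_URL, headers={"Metadata-Flavor": "Google"}
    )


def parse_token_response(body: str) -> Tuple[str, int]:
    """Return (access_token, seconds_until_expiry) from the JSON response."""
    payload = json.loads(body)
    return payload["access_token"], int(payload["expires_in"])


def build_list_objects_request(bucket: str, access_token: str) -> urllib.request.Request:
    """Build a Cloud Storage JSON API request listing objects in a bucket."""
    url = "https://storage.googleapis.com/storage/v1/b/{}/o".format(
        urllib.parse.quote(bucket, safe="")
    )
    return urllib.request.Request(
        url, headers={"Authorization": "Bearer " + access_token}
    )
```

Because `expires_in` counts down on the cached token, a value well under 3600 simply means the token was issued earlier in the hour.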
[17:12.620 --> 17:23.840]  What does that look like? Sorry, it's weird. Okay, sorry, getting a little click
[17:25.220 --> 17:30.240]  problems here. What does that look like? Well, you can see the curl command, okay?
[17:30.620 --> 17:35.680]  What it returns is this access token highlighted in blue, looking at the right side, right?
[17:35.800 --> 17:38.540]  The expiration date is a little bit less than an hour, basically. I'm sort of
[17:38.540 --> 17:45.260]  midstream, 400 or so seconds, and I'm getting the cached token. But at the end of the hour,
[17:45.260 --> 17:50.640]  it automatically refreshes and I can query it again and get a new token. I can then use that
[17:50.640 --> 17:57.680]  token in a second API call in the storage bucket, boom, it lists it, okay? So I don't even have to
[17:57.680 --> 18:03.500]  worry about caches, etc. I can always access the service account OAuth token if I have compromised
[18:03.500 --> 18:07.880]  the compute instance. And this is really common, right? Compute instances are targets. Once you
[18:07.880 --> 18:13.840]  have access, this is one of the first things that we should look at as attackers and worry about as
[18:13.840 --> 18:19.200]  defenders. Speaking of compute instances, if you have your own compute instances inside the GCP
[18:19.200 --> 18:24.520]  environment, and you install Google Cloud SDK there, everything we've talked about applies
[18:24.520 --> 18:32.740]  there, right? Makes perfect sense. Because you have cached tokens as well. So that becomes a
[18:32.740 --> 18:39.360]  target on the attack side. But what's maybe not so obvious are the GCP-managed compute instances:
[18:39.360 --> 18:43.060]  if you run Cloud Shell, the whole purpose of that is sort of ease of use. You don't have to worry
[18:43.060 --> 18:48.420]  about a VM or the SDK. Well, the SDK, the latest version is always installed for you. You now have
[18:49.360 --> 18:54.360]  a nice up-to-date environment to do your CLI commands and manage your environment as a
[18:54.360 --> 18:58.460]  defender. Well, from an attacker perspective, that becomes a target as well, because the SDK is
[18:58.460 --> 19:03.060]  installed. There's been some good work, and I recommend if you're interested in this particular
[19:03.060 --> 19:11.080]  area, read Juan Berner's post a year and some ago on backdoors with Google Cloud Shell. Basically,
[19:11.080 --> 19:18.720]  you can install a backdoor shell, you know, one-liner, you can exfiltrate the token cache
[19:19.480 --> 19:24.400]  automatically. So every time someone starts up the Cloud Shell, you get the fresh sort of token dump.
[19:25.820 --> 19:30.920]  And you can move on your merry way. One note about Cloud Shell. Well, how do you access that?
[19:30.920 --> 19:34.540]  You can do it through the Google console, but you can also do it through a gcloud command,
[19:34.540 --> 19:38.740]  right? So you can think that this could be a chain command. If I compromise your endpoint,
[19:38.740 --> 19:43.740]  that's my target outside of GCP, and you use Cloud Shell, I may be able to get into your
[19:43.740 --> 19:49.320]  Cloud Shell environment, install a backdoor, and then continue to have access from this
[19:49.880 --> 19:56.320]  compute instance sort of environment. So this complicates things, right? These are
[19:56.320 --> 20:02.300]  all sort of variations. We're not just talking about your endpoint where you normally do Google
[20:02.960 --> 20:10.980]  Cloud Admin. All right, we're about halfway through. Let's shift focus to the defensive side.
[20:10.980 --> 20:17.120]  We have a lot to talk about here. Basically, from a prevention viewpoint, put on your blue hats.
[20:17.120 --> 20:23.520]  There are three basic things that I think will help. It is hard, right, to prevent credential
[20:23.520 --> 20:28.860]  compromise. You really can't. How do you know it's compromised versus the normal user? But we can
[20:28.860 --> 20:34.000]  mitigate it. So the number one thing to do is expiration time. IP whitelisting, if you can
[20:34.000 --> 20:40.820]  manage that, is helpful, as well as MFA. And there are some additional topics regarding IP
[20:40.820 --> 20:45.900]  whitelisting that we need to touch base on. Okay, so let's talk about cloud session duration.
[20:45.900 --> 20:53.360]  On the G Suite admin side, which is, you know, a yellow flag already. Anytime I say G Suite admin,
[20:53.360 --> 20:58.980]  and you're a GCP admin, your ears should sort of perk up. It's a different console. If you're
[20:59.080 --> 21:03.820]  a bigger organization, that may be a different team. If you have access to it as a GCP admin,
[21:03.820 --> 21:08.900]  great. This will be familiar to you. But it's a beta feature, about a year old, if I'm correct
[21:08.900 --> 21:14.840]  on that. So people may not know about it. This is one of the most useful things. By default,
[21:14.840 --> 21:22.180]  sessions for SDK access do not expire, never expires. In fact, in the earlier attack scenario,
[21:22.180 --> 21:27.480]  in the prod environment we went into, guess what the default was set to? The default was never
[21:27.480 --> 21:33.060]  expires. It was never changed. You can set it to one to 24 hours. Recommend really short,
[21:33.060 --> 21:39.040]  like one hour for prod environments, maybe eight plus hours for your dev environments, right?
[21:39.040 --> 21:48.120]  When it expires, all access, both console and programmatic, all those sessions are revoked,
[21:48.120 --> 21:54.940]  and anyone accessing the environment has to actually re-authenticate, all right? So this
[21:54.940 --> 22:01.680]  shrinks the window during which you can be compromised. Now, there is a re-authentication
[22:01.680 --> 22:06.100]  method. You have two choices. When they're re-authenticating after the expiration,
[22:08.000 --> 22:11.920]  a password or security key. This is hardware security key. The only thing to note here is
[22:11.920 --> 22:18.780]  if you use something in the middle, like a software MFA, which I have on one of my dev
[22:18.780 --> 22:23.860]  environments, you have to pick a lesser, a less strong method, which is password.
[22:24.280 --> 22:30.320]  Just a note, it means your re-authentication is less secure because it's no longer MFA,
[22:30.880 --> 22:37.180]  and opens up to keyboard logging and losing a password credential.
[22:38.840 --> 22:42.440]  Let's talk about IP whitelisting, the second thing on my list,
[22:42.440 --> 22:48.660]  all right? So still a relatively new feature, but you can do VPC service controls
[22:49.560 --> 22:57.440]  to implement an IP whitelist. In the Access Context Manager in Google Cloud, you create a whitelist,
[22:57.440 --> 23:03.000]  specify a bunch of CIDRs and set up a policy. Please don't be thrown that there's an RFC 1918
[23:03.000 --> 23:08.180]  address in here. That's my obfuscation in play. This is normally your public IP
[23:08.940 --> 23:14.640]  or CIDR that you want in your whitelist. Switching over to the VPC service control
[23:14.640 --> 23:22.540]  section, you can specify the APIs and services, everything as an example, attach and reference
[23:22.540 --> 23:28.760]  the actual access level with the whitelist IPs, put it in place. And the next time that someone
[23:28.760 --> 23:36.640]  tries to do a command to the resources inside that perimeter, VPC perimeter, they'll get an error,
[23:36.640 --> 23:43.640]  right? So a different host, different IP, it's not in the whitelist. All good, right? Really good.
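The decision the perimeter makes here boils down to a CIDR membership check. A minimal Python sketch of that logic, with a hypothetical function name and documentation-range addresses standing in for real ones, is below; it illustrates the idea and is not how VPC Service Controls is implemented.

```python
import ipaddress
from typing import Iterable


def is_allowed(source_ip: str, allowed_cidrs: Iterable[str]) -> bool:
    """Return True if source_ip falls inside any of the allowed CIDR ranges."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        addr in ipaddress.ip_network(cidr, strict=False) for cidr in allowed_cidrs
    )
```

For example, with an allowlist of `["203.0.113.0/24"]`, a request from `203.0.113.7` passes and one from `198.51.100.9` does not, which is the situation of the attacker host replaying stolen tokens from a different IP.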
[23:46.680 --> 23:51.380]  What's of interest to note is you get sort of a verbose message. Not sure what I think about
[23:51.380 --> 23:58.040]  that from a defensive side, sort of unsettles me, but it's clear that it works. There is more
[23:59.140 --> 24:04.020]  complication when you think about IP whitelisting. It's not just users and what devices they're
[24:04.020 --> 24:11.800]  coming from or what IPs rather. You can manage that, try to enforce VPNs, IP space that you know,
[24:11.800 --> 24:19.160]  proxies, but you also have to worry about VM instances. And here's where it's harder lifting.
[24:19.160 --> 24:24.900]  Essentially, if you go IP whitelisting, you want to make sure that your VM IPs are also whitelisted
[24:24.900 --> 24:31.200]  so that any API calls that they do are allowed, right? And so there's a couple things to worry
[24:31.200 --> 24:36.320]  about here. It's just the maintenance issue. You could hook into the startup and provisioning of
[24:36.320 --> 24:42.560]  your VMs to make sure that the VMs' IP addresses are included automatically in your IP list,
[24:42.560 --> 24:50.660]  especially if you have a lot of VMs or dynamic IPs. Detection, it's not prevention, but a side
[24:50.660 --> 24:55.860]  note, could be done based on logs instead of an explicit whitelist, right? And some of the
[24:55.860 --> 25:01.700]  references below sort of talk about that as a technique, right? So one alternative to IP whitelisting
[25:01.700 --> 25:09.360]  is to cull from logs the IPs of VMs that have been started and see if there's anything
[25:09.360 --> 25:16.800]  abnormal in it compared to a whitelist or instead of a whitelist. You can hook in and put in place
[25:17.000 --> 25:23.400]  a man-in-the-middle proxy for the metadata service, which would give you visibility to anyone
[25:23.400 --> 25:28.660]  querying it for OAuth tokens. And that might be useful to track and determine and see if that's
[25:28.660 --> 25:35.740]  worth monitoring and alerting on. Okay, so this is also a bit of a detection piece. But the main
[25:35.740 --> 25:41.680]  point here from a prevention perspective is you have to think about VM instances and ensure that
[25:41.680 --> 25:45.600]  they are whitelisted properly if you go down that route. So there is some work to be done, but if
[25:45.600 --> 25:54.320]  you can accomplish it, then IP whitelisting is very effective at helping you mitigate the issues
[25:54.320 --> 25:58.600]  with any kind of compromised credentials. And I'm going to do a little shout out to
[25:59.440 --> 26:03.280]  the Netflix security team, and I have some references there. And they've blogged for quite a
[26:03.280 --> 26:08.180]  while about the work they've done in AWS. A lot of their concepts apply over to GCP and are worth
[26:08.180 --> 26:14.260]  thinking about as we go. Once you have the whitelist in place, of course, you have to make sure it's
[26:14.260 --> 26:20.920]  enforced. So you can do a little CLI foo. And on the right, you can see the definitions of your
[26:20.920 --> 26:26.440]  access levels and policies and check the IPs and make sure they're the right ones. You can check
[26:26.440 --> 26:33.260]  the VPC service perimeters and make sure that they're referencing access levels as well. So
[26:33.820 --> 26:38.640]  it's all well and good to have a whitelist, just make sure it's actually enforced and put in place.
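The log-based side note from a few moments ago (cull caller IPs from logs and compare them with the allowlist) could look roughly like the Python sketch below. It assumes log entries already exported as dictionaries with the Cloud Audit Logs field `protoPayload.requestMetadata.callerIp`; the function name and the sample shape are simplifications for illustration.

```python
import ipaddress
from typing import Dict, Iterable, Mapping


def callers_outside_allowlist(
    log_entries: Iterable[Mapping], allowed_cidrs: Iterable[str]
) -> Dict[str, int]:
    """Count requests per caller IP that fall outside every allowed CIDR."""
    networks = [ipaddress.ip_network(cidr, strict=False) for cidr in allowed_cidrs]
    flagged: Dict[str, int] = {}
    for entry in log_entries:
        ip_text = (
            entry.get("protoPayload", {}).get("requestMetadata", {}).get("callerIp")
        )
        if not ip_text:
            continue
        try:
            addr = ipaddress.ip_address(ip_text)
        except ValueError:
            continue  # skip values that are not literal IP addresses
        if not any(addr in net for net in networks):
            flagged[ip_text] = flagged.get(ip_text, 0) + 1
    return flagged
```

Run periodically against exported audit logs, a non-empty result is a cheap signal that either the allowlist is stale or a credential is being used from somewhere unexpected.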
[26:38.920 --> 26:45.720]  Let's move to the third area of prevention, MFA. On the G Suite admin side, yellow flag,
[26:45.720 --> 26:51.520]  if you don't have access, is where MFA is controlled. Set it up and enforce it as much as you can
[26:51.520 --> 26:58.080]  with your admin users. It'll help mitigate that compromise of credentials during any
[26:58.080 --> 27:08.440]  re-authentication. Okay, let's move to detection. This is a shorter discussion, unfortunately.
[27:08.440 --> 27:16.260]  Behavioral detection is really the problem for compromised credentials. It's difficult to
[27:16.260 --> 27:23.880]  discern differences from valid user actions and behavior unless they do something egregious,
[27:23.880 --> 27:31.320]  like exfiltrate some terabytes. I'm not going to talk about that. I want to talk about if you do
[27:31.920 --> 27:37.820]  or are able to implement IP whitelisting effectively, effectively with low false positives,
[27:38.740 --> 27:44.620]  detecting failed authentications is worth doing, right? And Stackdriver on the right will get a
[27:44.620 --> 27:52.240]  nice entry that we can key off of and monitor. And if we have low enough false positives and
[27:52.240 --> 27:58.360]  it's maintained, it's a good way to get a heads up that there may be a compromised credential
[27:58.360 --> 28:04.000]  being used from an IP. This would cover the scenario where I exfiltrate credentials or
[28:04.000 --> 28:10.960]  tokens and try to use them from an IP that I control, right? The old style way of attacking
[28:10.960 --> 28:15.440]  would be to always do that. If people focus on whitelisting from a defensive viewpoint,
[28:15.440 --> 28:19.640]  it would be interesting to see if, from an attack perspective, we did more follow-up work from the
[28:19.640 --> 28:26.580]  compromised host, because that would still be allowed in this scenario. So we're not dealing
[28:26.580 --> 28:33.020]  with perfection. We're just trying to improve our ability to detect. Okay, we've got to spend some
[28:33.020 --> 28:40.900]  time on remediation, okay? This is complicated. Let me see if I can get through this. For
[28:40.900 --> 28:44.640]  remediation, basically, I think there's two major things we have to worry about. How do we lock out
[28:44.640 --> 28:49.420]  the account, prevent future access? How do we revoke current access, the current sessions,
[28:49.420 --> 28:56.160]  which are the OAuth access tokens and refresh tokens? User and service accounts are different.
[28:56.920 --> 29:03.380]  So here are some of the commonly available or known options. Let's go through them one by one.
[29:03.420 --> 29:07.940]  It's going to be a little bit of remediation bingo, all right? As we go through this,
[29:07.940 --> 29:12.680]  see if you can pick out or suggest the best option, and I'll tell you mine.
[29:13.860 --> 29:17.740]  All right, user accounts. How do we lock out user accounts? Well, G Suite Admin,
[29:17.740 --> 29:23.400]  okay, G Suite Admin, we can reset password, suspend the user, and reactivate it later,
[29:23.400 --> 29:28.340]  delete the user. They're all effective. They all lock out the user or attacker. We'll pick
[29:28.340 --> 29:35.000]  the lowest impact one. Resetting password, pretty common, works. That's the answer.
[29:35.880 --> 29:43.720]  How about service accounts? Similar, parallel set of options. I can delete the key here. I'm
[29:43.720 --> 29:48.120]  in Google Cloud Console for service accounts, right? G Suite Admin for user accounts,
[29:48.120 --> 29:53.020]  Google Cloud for service accounts. I can disable the service account itself,
[29:53.020 --> 30:04.980]  and I can delete the service account. Which one? They all work. We'll pick the lowest impact one. Rotate,
[30:04.980 --> 30:09.940]  right? Now, let's get to the tough part. The tough part is revoking current access,
[30:09.940 --> 30:17.660]  and that's the topic of this talk, right? So, let's see. Of these six suggested options,
[30:17.660 --> 30:24.500]  the last three are the lockout account measures. The first three are potentially common ones that
[30:24.500 --> 30:31.100]  people try or think about. Let's go through them one by one. In G Suite Admin, for a particular
[30:31.100 --> 30:37.500]  user, in their security section, there is a sign-in cookies entry and a reset action. What does this do? It
[30:37.500 --> 30:45.040]  revokes all web sessions immediately. This includes Google Cloud Console. It would also include G
[30:45.040 --> 30:53.840]  Suite, okay? This works. This is good. It includes any Google app and service. This is great,
[30:53.840 --> 30:58.180]  except it doesn't do anything about SDK sessions. So, actually, it's only half of a
[30:58.180 --> 31:04.320]  solution. So, we lose. Let's keep going. There's an interesting gcloud CLI command that you have
[31:04.320 --> 31:11.080]  probably seen or used, auth revoke. It takes an account name, and it will revoke both the access
[31:11.080 --> 31:18.040]  and refresh tokens associated with that user account. Underneath, it makes an API call to
[31:18.040 --> 31:24.360]  this endpoint to revoke the specific token. That also works if you want to call it directly.
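
[Editor's sketch] To make the direct call concrete, here is a minimal Python sketch of a token revocation request. The endpoint on the speaker's slide is not captured in the transcript, so the URL below is an assumption: it is the revocation endpoint from Google's public OAuth 2.0 documentation. The function name and placeholder token are hypothetical. The request is only built, not sent.

```python
import urllib.parse
import urllib.request

# Assumption: Google's publicly documented OAuth 2.0 revocation endpoint.
# The slide in the talk may show a different URL.
REVOKE_URL = "https://oauth2.googleapis.com/revoke"


def build_revoke_request(token: str) -> urllib.request.Request:
    """Build (but do not send) a POST asking Google to revoke one token.

    `token` may be an access token or a refresh token. Note that the
    caller has to already hold the token value itself.
    """
    body = urllib.parse.urlencode({"token": token}).encode("ascii")
    return urllib.request.Request(
        REVOKE_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_revoke_request("PLACEHOLDER_TOKEN")  # hypothetical value
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req) would actually send the request.
```

The only input is the token value, which is why the call is only useful to someone who can still read that value.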
[31:24.360 --> 31:29.940]  Sounds great, doesn't it? But it's in the middle of my list, so you know that it actually doesn't
[31:29.940 --> 31:36.620]  work. Here's the problem. It can only be run on the gcloud client machine. Why? Because you pass
[31:36.620 --> 31:42.820]  in an account name, and it looks in the SQLite database cache to find the token. And that is
[31:42.820 --> 31:51.640]  easily manipulated by the attacker, at a minimum, deleted, right? So, it can't be run anywhere else.
[31:51.640 --> 31:55.780]  Frankly, I don't think any incident responder would trust a compromised client machine for
[31:55.780 --> 32:05.940]  anything, right? Similarly, calling the API itself doesn't help. Why? You need to pass in
[32:06.520 --> 32:14.620]  an OAuth access token or revoke, sorry, refresh token. Those are not logged. They're not accessible.
[32:14.620 --> 32:18.940]  They don't exist anywhere except on the client machine, which is compromised, which is easily
[32:18.940 --> 32:26.140]  manipulated by the attacker. So, it's useless. These are for self-help, for you to clean up your
[32:26.140 --> 32:32.140]  cache and config, nothing else. One side note, which is a diversion: the call to revoke token
[32:32.140 --> 32:37.760]  doesn't require authentication. You just pass in this long access or refresh token. It's long
[32:37.760 --> 32:41.180]  enough that you probably can't brute force it, but it's an interesting note that you don't need
[32:41.180 --> 32:52.500]  OAuth to call. Okay, let's cover that lockout service account, sorry, user lockout options,
[32:52.500 --> 32:58.800]  all right? Changing the user password. These all don't work. They don't revoke SDK sessions.
[32:58.800 --> 33:04.680]  There's a narrow case where if you have requested scope for Gmail, then resetting the password would
[33:04.680 --> 33:10.900]  work, but that doesn't apply to us. We would have requested scope probably for just gcloud
[33:10.900 --> 33:18.020]  or SDK access. Suspending user account works temporarily. While the user account is suspended,
[33:18.020 --> 33:23.780]  the tokens will not work. Once, though, it's re-enabled, the tokens will start working again,
[33:23.780 --> 33:28.900]  assuming they have not expired. So, that's not a great solution. Unless you suspend the account
[33:28.900 --> 33:37.660]  for an hour, that would work. Sorry, let me backtrack on that. Deleting user account works,
[33:37.660 --> 33:42.860]  but has high impact, okay? So, these options aren't great. So, this is all yellow. The question
[33:42.860 --> 33:51.880]  is, what's our best option? So, one thing I did not list in the table is actually the best
[33:51.880 --> 33:58.940]  option, one that works and is low impact: delete the connected application. This is the connected
[33:58.940 --> 34:04.160]  OAuth application in the G Suite admin side. Basically, every time you authenticate and
[34:04.160 --> 34:10.320]  request access, the OAuth application, in this case, we're talking about the Google Cloud SDK,
[34:10.320 --> 34:15.720]  is listed here, as well as this access level, and you can delete it. And deleting it has no
[34:15.720 --> 34:22.080]  big ramifications. It just means next time someone accesses Google Cloud SDK, like through CLI,
[34:22.080 --> 34:26.180]  they're forced to re-authenticate. But once you delete it, at that point in time, all session
[34:26.180 --> 34:33.400]  tokens are revoked. So, it's the answer. That's the bingo. That's the bonus right there, okay?
[34:33.400 --> 34:37.680]  And if you haven't worked with G Suite admin, you wouldn't know about it, or maybe think about
[34:37.680 --> 34:42.780]  it, or if you're not an OAuth jockey, right? In OAuth land, this is pretty common to think about
[34:42.780 --> 34:46.940]  these connections. It just means it's trusted. And once you delete it, it revokes the sessions.
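
[Editor's sketch] The speaker performs this step in the G Suite admin console. For playbook automation, a possible programmatic counterpart is the Admin SDK Directory API `tokens.delete` method, which removes the tokens a user has issued to one OAuth application. Whether it matches the console action exactly is an assumption, not something stated in the talk. The function name, the example user, and the client ID below are hypothetical placeholders, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request

# Base URL of the Admin SDK Directory API.
DIRECTORY_API = "https://admin.googleapis.com/admin/directory/v1"


def build_token_delete_request(
    user_key: str, client_id: str, admin_access_token: str
) -> urllib.request.Request:
    """Build (but do not send) a Directory API tokens.delete request.

    user_key:  the affected user's primary email or user ID.
    client_id: the OAuth client ID of the connected application.
    admin_access_token: a token for an admin allowed to manage user
        security settings (assumed to be obtained elsewhere).
    """
    user = urllib.parse.quote(user_key, safe="")
    client = urllib.parse.quote(client_id, safe="")
    url = f"{DIRECTORY_API}/users/{user}/tokens/{client}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {admin_access_token}"},
        method="DELETE",
    )


if __name__ == "__main__":
    req = build_token_delete_request(
        "admin@example.com",          # hypothetical user
        "hypothetical-client-id",     # look up the real ID in the admin console
        "ADMIN_ACCESS_TOKEN",         # placeholder
    )
    print(req.get_method(), req.full_url)
```

Unlike the revoke call, this takes the user and the application as input rather than the token value, so it can be run from a trusted admin machine.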
[34:47.180 --> 34:54.580]  All right. Let's talk about the service account. How do we revoke that? Well, do we revoke it? Do
[34:54.580 --> 35:01.100]  we do some of these other lockout measures? Let's jump into it. Well, you might have guessed that
[35:01.100 --> 35:05.900]  the revoke CLI and the API call won't work, because we just went through it for the user
[35:05.900 --> 35:11.340]  account. It definitely doesn't work. You actually get errors here, right, when you try to do the
[35:11.340 --> 35:15.820]  CLI or the API. Outright errors. It just says it doesn't apply. But what's of note is, look in the
[35:15.820 --> 35:20.600]  error, sorry for the verbiage, but look at the error you get back from the CLI from gcloud.
[35:20.660 --> 35:25.380]  It actually hints in the last sentence that there's hope. If you revoke the parent service
[35:25.380 --> 35:32.120]  account, it's supposed to revoke the session, the actual sessions. What does that mean? Well,
[35:32.120 --> 35:37.400]  if you suspend it or delete it, that's true. But the suspend is sort of a half answer,
[35:37.400 --> 35:41.580]  because once you re-enable it, the tokens actually still work. What about the service
[35:41.580 --> 35:47.500]  account key? Revoke it? What's that mean? Well, it means delete the API key. That definitely
[35:47.500 --> 35:53.560]  doesn't work. That's a bug. Bug has been filed. Don't believe it, right? We've already gone
[35:53.560 --> 36:00.540]  through this. This doesn't work. How about the measures to rotate and delete the API key? Look,
[36:00.540 --> 36:06.840]  if you rotate, delete the API key, you can still use OAuth tokens and still access the service
[36:07.480 --> 36:13.600]  login and do actions as the service account. If you disable it, at that point in time,
[36:13.600 --> 36:19.180]  the tokens don't work, but they still exist. So once you re-enable, reactivate the service account,
[36:19.180 --> 36:24.600]  they'll work again. So the solution there is you could suspend for an hour, then re-enable it,
[36:24.600 --> 36:28.840]  because in that hour, tokens will expire. How about deleting service account? Works,
[36:28.840 --> 36:37.600]  just like deleting user account, high impact. So the summary is actually disable for an hour
[36:38.140 --> 36:43.740]  or delete and reprovision. You will choose whichever you're more comfortable with.
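
[Editor's sketch] The "disable for an hour" option is timing arithmetic: a token minted just before the disable stays valid for one full token lifetime, so re-enabling is only safe after that lifetime has passed. The sketch below assumes the one-hour lifetime mentioned in the talk and adds a small safety margin. The names and the five-minute margin are illustrative choices, not values from the talk.

```python
from datetime import datetime, timedelta, timezone

# Assumption: access tokens live for one hour, as stated in the talk.
ASSUMED_TOKEN_LIFETIME = timedelta(hours=1)
# Assumption: arbitrary extra margin for clock skew and human error.
SAFETY_MARGIN = timedelta(minutes=5)


def earliest_reenable(
    disabled_at: datetime,
    lifetime: timedelta = ASSUMED_TOKEN_LIFETIME,
    margin: timedelta = SAFETY_MARGIN,
) -> datetime:
    """Earliest time at which every pre-disable token has expired."""
    return disabled_at + lifetime + margin


def safe_to_reenable(disabled_at: datetime, now: datetime) -> bool:
    """True once the service account can be re-enabled under the assumptions."""
    return now >= earliest_reenable(disabled_at)


if __name__ == "__main__":
    disabled = datetime.now(timezone.utc)
    print("Re-enable no earlier than:", earliest_reenable(disabled).isoformat())
```

If token lifetimes have been extended in your environment, pass the longer lifetime explicitly rather than relying on the default.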
[36:43.920 --> 36:48.820]  Whichever gets you back up and running quicker, that you can perform with confidence,
[36:48.820 --> 36:55.240]  without making errors. This isn't great, but there are ways forward, right? And the main thing of
[36:55.240 --> 36:59.940]  this whole slide and the time spent on all the paths that weren't great paths or were dead ends
[36:59.940 --> 37:03.440]  is to make sure we don't waste time on that, especially in the middle of an incident, right?
[37:03.440 --> 37:11.740]  Is to feed this and review our operations playbooks and make sure that we have the right actions for
[37:11.740 --> 37:20.420]  remediation for the different types of accounts. So to wrap up in the next minute or so and at the
[37:20.420 --> 37:25.260]  end, we went over a bunch of attack scenarios, right? Definitely user accounts and the cache
[37:25.260 --> 37:29.780]  on an external client. That's really what we focused on. But don't forget that service accounts
[37:29.780 --> 37:35.980]  also cache the access token and that an attacker would want to grab that because it helps them
[37:35.980 --> 37:41.400]  maintain persistence in the light of all those remediation actions that don't quite work.
[37:41.740 --> 37:48.280]  But it's not just external clients. If you switch to a Compute Engine environment and you've installed
[37:48.280 --> 37:53.800]  SDK, all the same things apply. And it's not just things that you explicitly start. Remember to
[37:53.800 --> 38:00.300]  think about Cloud Shell, which starts up a VM in the back with the latest SDK installed, right?
[38:00.300 --> 38:09.600]  These all have the same cached-token kind of exposure and opportunity. And fifth, last but
[38:09.600 --> 38:17.600]  not least, the fifth area is even if SDK is not installed, right? Compute Engine compromise,
[38:17.600 --> 38:23.480]  OAuth tokens for that service account are easily queried from the metadata service and can allow
[38:23.480 --> 38:28.980]  the attacker access as well as persistence for that access. And from a defensive measure,
[38:28.980 --> 38:34.660]  real quick summary after that long sort of part of the presentation, do three things on the
[38:34.660 --> 38:40.200]  prevention side. The no-brainer is set your session timeout in G Suite. If you can, implement
[38:40.200 --> 38:46.900]  IP whitelisting with service controls and set your MFA so that these all mitigate credential
[38:46.900 --> 38:52.160]  compromise in some way. And if you do do that IP whitelisting, then you can improve your detection
[38:52.900 --> 38:57.380]  by detecting the failed authorizations. From a remediation perspective for user accounts,
[38:57.380 --> 39:02.820]  reset the password to lock it out and delete the OAuth connected app in G Suite. That's the key
[39:02.820 --> 39:07.800]  one that may not be obvious to everyone. For service accounts, pick your poison. One of two,
[39:07.800 --> 39:13.480]  disable and reactivate within an hour time period, or delete, recreate, reprovision. If you have that
[39:13.480 --> 39:19.860]  automated, just make sure and want to get up and running faster. Just have to worry about VMs,
[39:19.860 --> 39:26.600]  compute instances, anything using those service accounts. So that is the end and I will sort of
[39:26.600 --> 39:32.380]  ask Jaron to see if we have any time for questions. Otherwise, you can always reach me
[39:32.380 --> 39:42.300]  directly as well. Yeah, we have one question from Eric. Do you know of any blogs or white papers
[39:42.300 --> 39:49.400]  that people can use to replicate the OAuth token hijacking that you showed earlier?
[39:51.880 --> 40:04.180]  Yes, so there are some resources for sure. We will be blogging over the next week or two a lot of the
[40:04.180 --> 40:10.120]  details from this presentation. So you'll actually be able to recreate what's seen in the video
[40:10.840 --> 40:17.120]  with actual commands. But there's been prior work as well. So if you search on gcloud,
[40:17.510 --> 40:22.530]  I want to say if I have the name right, I didn't reference in the presentation,
[40:22.980 --> 40:28.540]  I'll have to add that later. But John Haney has, I believe I have his name right, has done some
[40:28.540 --> 40:34.320]  early work on just peeling apart the SQLite databases and some of the REST API calls.
[40:34.620 --> 40:41.180]  And that's a useful resource. It's not documented well, particularly, but it's also not hard to
[40:41.180 --> 40:48.740]  delve into it. So both our blogs should be released in part, I think today, as well as
[40:48.740 --> 40:52.920]  over the next two weeks, as well as there's some prior work if you search on some of the keywords
[40:52.920 --> 40:57.640]  like the gcloud, the .config, the SQLite database names.
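
[Editor's sketch] As a starting point for "peeling apart" those SQLite files on a machine you own, the sketch below lists the tables and column names in a SQLite database without reading any row data. The file location in the usage example (`~/.config/gcloud/access_tokens.db`) is an assumption based on the keywords mentioned above and may differ by SDK version or platform. The function name is hypothetical.

```python
import sqlite3
from pathlib import Path


def describe_sqlite(path: str) -> dict:
    """Return {table_name: [column names]} for a SQLite file.

    The file is opened read-only, so nothing is created or modified,
    and no row contents (such as token values) are read.
    """
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            )
        ]
        return {
            table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }
    finally:
        conn.close()


if __name__ == "__main__":
    # Assumed default location of the gcloud token cache on Linux/macOS.
    candidate = Path.home() / ".config" / "gcloud" / "access_tokens.db"
    if candidate.exists():
        print(describe_sqlite(str(candidate)))
    else:
        print("No file at assumed path:", candidate)
```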
